Abstract

We describe lossless quantum compression of unknown mixtures (of non-orthogonal 
states) and give an expression of the optimal rate of compression. 

1 Introduction 

The aim of lossless quantum compression is to compress a mixture (of possibly
non-orthogonal states) exactly and without error. It was previously thought that
compressing a mixture $\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ is impossible when the
value $i$ of the state $|\psi_i\rangle$ to be compressed is unknown. If $\mathcal{E}$ is
compressed using a variable length code, then $|\psi_i\rangle$ might be in an unknown
superposition of different lengths, in which case the number of qubits that the
compressor should send to the decompressor cannot be determined.

If lossless quantum compression were impossible, this would indicate another profound
difference between classical and quantum information. Even if a fault tolerant
implementation of quantum computation were found, a mixture containing large amounts
of redundancy could not be compressed without introducing errors. Losing information
would then be an inherent feature of efficient quantum computations involving
communication.

In this paper, we show that lossless quantum compression is possible. In an "always 
open" model of communication, the decision "how many qubits to transmit" does not 
have to be taken. We show how to find the optimal rate of compression by looking at the 
probability that a state lies in a particular Hilbert space. This gives the optimal rate of 
compression of both known and unknown mixtures. Lossless quantum compression of 
unknown states is useful when the use of qubits has some cost. One example of a cost is 
the probability of decoherence when a mixture is passed through a noisy channel which 
disturbs each qubit independently with some probability. If the mixture is losslessly 
compressed, then the number of dimensions in which it lies is minimised, hence the 
probability that it is disturbed is minimised. 



2 Synopsis 

This paper is organised as follows. First we describe the background, including the 
previous work on lossless quantum compression and some definitions, the arguments 
why lossless quantum compression is impossible and an asynchronous model of quantum 
computation. Next we describe a model of communication in which lossless quantum 
compression can take place. We then prove the optimal rate of compression and show 
how it can be used to protect a mixture from being disturbed in the presence of noise. 
We conclude with ideas for future work. 
3 Background

Lossless classical compression is an everyday application for compressing files so that
they can be stored more compactly on a hard drive or sent more efficiently over a
channel such as the internet. Lossless compression can be used when lossy compression
cannot, for example, in real-time applications where large blocks of data are unavailable.
Classical lossless compression is also useful theoretically: for example, it gives the
relative entropy between two systems X and Y a simple interpretation as the additional
expected number of bits used when X is compressed using the optimal compression
code for Y (rather than the optimal compression code for X).

The aim of lossless quantum compression is to compress a mixture
$\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ of quantum states using a variable length quantum
code so that the original mixture can be retrieved exactly and without error. When the
$|\psi_i\rangle$'s are orthogonal, this is equivalent to lossless classical compression
(since we can rotate $|\psi_i\rangle$ round to $|i\rangle$ where $|i\rangle$ is in the
computational basis). The challenge is therefore to encode $\mathcal{E}$ when the
$|\psi_i\rangle$'s are non-orthogonal and the code words might have indeterminate lengths.



3.1 Indeterminate Length Strings 

If we use a fixed length code to losslessly encode a mixture of states, we do not gain any
compression. Suppose we use a variable length code represented by a unitary operation
$C$ to encode $|0\rangle$ as $C|0\rangle = |00\rangle$ and $|1\rangle$ as
$C|1\rangle = |111\rangle$. Then $(|0\rangle + |1\rangle)/\sqrt{2}$ is encoded as

$$C\left(\frac{|0\rangle + |1\rangle}{\sqrt{2}}\right) = \frac{|00\rangle + |111\rangle}{\sqrt{2}} \qquad (1)$$

which does not have a determinate length. It is thus called an indeterminate length
string.

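Definition 4 (Indeterminate Length String) $|\psi\rangle = \sum_i \alpha_i |i\rangle$ is
an indeterminate length quantum string if there exist $i$ and $j$ with $|\alpha_i| > 0$
and $|\alpha_j| > 0$ and $l(i) \neq l(j)$, where $l(i)$ denotes the length of the string $i$.

Determinate length strings of length $n$ exist in the Hilbert space $H^{\otimes n}$.
Indeterminate length strings exist in the Fock space

$$H^{\oplus} = \bigoplus_n H^{\otimes n} \qquad (2)$$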

Bostrom and Felbinger [I] defined two ways to quantify the lengths of indeterminate 
length strings. 

Definition 5 (Lengths of Indeterminate Length Strings) The base length $L$ of
an indeterminate length string is the length of the longest part of its superposition

$$L\left(\sum_i \alpha_i |i\rangle\right) = \max_{|\alpha_i| > 0} l(i) \qquad (3)$$

The average length $\bar{l}$ of an indeterminate length quantum string is the average
length of its superposition

$$\bar{l}\left(\sum_i \alpha_i |i\rangle\right) = \sum_i |\alpha_i|^2 \, l(i) \qquad (4)$$

If we observe the length of a quantum string, then $\bar{l}$ gives us the expected length
we observe and $L$ gives us the maximum length we can observe. Given an indeterminate
length string $|\psi\rangle$, neither its average length nor its base length can be
observed without disturbing it.
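
As a small illustration of the two length measures, the following sketch computes them
for the encoded string of equation (1). The dictionary representation (classical bit
string mapped to amplitude) and the function names are assumptions made for this
example only; they are not taken from Bostrom and Felbinger.

```python
import math

def base_length(state):
    """Longest length among components with non-zero amplitude (base length L)."""
    return max(len(s) for s, a in state.items() if abs(a) > 0)

def average_length(state):
    """Expected length seen if the length were measured (average length)."""
    return sum(abs(a) ** 2 * len(s) for s, a in state.items())

# (|00> + |111>)/sqrt(2), the indeterminate length string of equation (1)
psi = {"00": 1 / math.sqrt(2), "111": 1 / math.sqrt(2)}
L = base_length(psi)        # longest branch of the superposition
l_bar = average_length(psi) # weighted by |amplitude|^2
```

Here the base length is 3 while the average length is 2.5, which shows that the two
measures generally differ for a string in a superposition of lengths.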
5.1 Can Indeterminate Length Strings be Used for Coding?

Various papers have described problems in using indeterminate length strings
for lossless data compression. Braunstein et al. pointed out three difficulties of data
compression with indeterminate length strings. The first is that if the indeterminate
length strings are unknown to both the sender and the receiver, it is unclear how the
times taken by different computational paths can be synchronised when computations are
performed on the strings. The second difficulty is that if a mixture of indeterminate
length strings is transmitted at a fixed speed, then the recipient can never be sure
when a message has arrived and the strings can be decompressed. The third difficulty
is that if the data compression is performed by a read/write head (as in a Turing
machine), then after the data compression, the head location of the sender is entangled
with the "lengths" of the indeterminate length string which represents the compressed data.

Koashi and Imoto [2] argued that it is impossible to faithfully encode a mixture
of non-orthogonal quantum strings. They modelled lossless data compression as taking
place on a hard disc with a maximum memory size of $N$ qubits. A compressed state
on the hard disc would be an unknown indeterminate length quantum string with base
length $L$, in which case only the remaining $N - L$ qubits would be usable by other
applications without disturbing the compressed state. However, the base length $L$ is
not an observable, so the other applications cannot determine how many qubits are
available. Thus the remaining $N - L$ qubits are not available for other applications to
use unless $L$ is taken to be the length of the longest code word.

Schumacher and Westmoreland envisaged that indeterminate length quantum
strings would be padded with zeros to create determinate length strings. Each code
word of a variable length code would be padded with zeros so that each code word
had the same length. They modelled the data compression as taking place between
two parties, Alice and Bob, in which Alice sends Bob only the original strings (with the
zero-padding removed), leaving Alice with a number of zeros depending on the length of
the string she sent. If she sends Bob an indeterminate length string, then after the
transmission Alice and Bob are entangled. This is illustrated with an example.

Example 6 (Lossless Quantum Compression with Zero-padding) If $X$ is a prefix
free set given by $X = \{|0\rangle, |10\rangle, |110\rangle, |111\rangle\}$ then after
zero-padding, $X$ is transformed into the set
$X' = \{|000\rangle, |100\rangle, |110\rangle, |111\rangle\}$. Alice starts off with a
zero-padded string in the Hilbert space spanned by $X'$. If she starts off with the state
$|000\rangle$ or $|100\rangle$, then she sends Bob $|0\rangle$ or $|10\rangle$ respectively
and she is left with the state $|00\rangle$ or $|0\rangle$ respectively. If she starts off
with the state $(|000\rangle + |100\rangle)/\sqrt{2}$ then she sends
$(|0\rangle + |10\rangle)/\sqrt{2}$ and is left with the state
$(|00\rangle + |0\rangle)/\sqrt{2}$. By measuring the state
$(|00\rangle + |0\rangle)/\sqrt{2}$, she can collapse the state
$(|0\rangle + |10\rangle)/\sqrt{2}$ which Bob has received.

Since Alice can disturb the state she has sent to Bob, this scheme is an unsuitable 
model of lossless quantum data compression. 

Bostrom and Felbinger pointed out that quantum prefix strings are not useful.
Classical prefix strings carry their own length information; however, the length
information of indeterminate length prefix strings is unobservable without disturbing
the string. They also considered zero-padding and said that in such a scheme an unknown
indeterminate length quantum string could not be transmitted because the number of zeros
to remove before the string is transmitted cannot be determined without disturbing it.

6.1 Properties of Indeterminate Length Strings 

Schumacher and Westmoreland [2] investigated the general properties of indeterminate 
length strings. An indeterminate length string can be padded with zeroes so that its 
length becomes an observable.

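Definition 7 (Zero Extended Form) If $|\psi\rangle = \sum_{i < 2^{l_{max}}} \alpha_i |i\rangle$
is a quantum string in a register of $l_{max}$ qubits, then its zero-extended form is:

$$|\psi^{zef}\rangle = \sum_{i < 2^{l_{max}}} \alpha_i \, |i \, 0^{\otimes (l_{max} - l(i))}\rangle \qquad (5)$$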

Given a sequence of N strings, it is useful to be able to condense them so that the 
strings are packed together at the beginning of the string and the zero-padding all lies 
at the end of the sequence. 

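Definition 8 (Condensable Strings) A set of strings $\mathcal{E}$ is condensable if for
any $N$, there exists a unitary operation $U$ such that:

$$U\left(|\psi_1^{zef}\rangle \otimes \ldots \otimes |\psi_N^{zef}\rangle\right) = \left(|\psi_1\rangle \otimes \ldots \otimes |\psi_N\rangle\right)^{zef} \qquad (6)$$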

It is easy to see that superpositions of classical prefix free strings are condensable. 
Prefix strings were defined more generally. 

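Definition 9 (Zero-Padded Prefix Free Strings) Suppose $|\psi_1\rangle$ and $|\psi_2\rangle$
are quantum strings with $L(|\psi_1\rangle) > L(|\psi_2\rangle)$ and that they are in a
register of $l_{max}$ qubits. The first $l_1$ qubits of $|\psi_2^{zef}\rangle$ may be in a
mixed state, described by the density operator

$$\rho_2^{1 \ldots l_1} = \mathrm{Tr}_{l_1 + 1 \ldots l_{max}}\left(|\psi_2^{zef}\rangle\langle\psi_2^{zef}|\right) \qquad (7)$$

$|\psi_1\rangle$ and $|\psi_2\rangle$ are prefix free if:

$$\langle \psi_1^{1 \ldots l_1} | \, \rho_2^{1 \ldots l_1} \, | \psi_1^{1 \ldots l_1} \rangle = 0 \qquad (8)$$

where $|\psi_1^{1 \ldots l_1}\rangle$ denotes the first $l_1$ qubits of $|\psi_1\rangle$'s
zero-extended form.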

9.1 From Lossless Coding to Lossy Coding 

Schumacher and Westmoreland demonstrated that if a mixture $\mathcal{E}^{\otimes n}$ is
encoded with a variable length condensable code, a fixed length lossy code can be
obtained by projecting onto approximately $n(S(\mathcal{E}) + \delta)$ qubits. If
$\mathcal{E}$ is a mixture with density operator $\rho$, where $\rho$'s spectral
decomposition is:

$$\rho = \sum_i p_i |i\rangle\langle i| \qquad (9)$$

then $\mathcal{E}$ can be encoded by encoding each $|i\rangle$ as a prefix free string of
length $\lceil -\log(p_i) \rceil$ with zero-padding. $\rho^{\otimes n}$ can be encoded in
the same fashion. Almost every string in the typical subspace of $\rho^{\otimes n}$ has
probability arbitrarily close to $2^{-nS(\rho)}$ as $n$ grows large. Thus almost every
string in the typical subspace of $\rho^{\otimes n}$ is encoded as a string of length
arbitrarily close to $nS(\rho)$. By projecting onto $n(S(\rho) + \delta)$ qubits, we
project onto the encoded typical subspace of $\rho^{\otimes n}$. We can decode the typical
subspace to obtain the original mixture with arbitrarily high (but not perfect)
probability and fidelity. Thus we can use a variable length code to design a lossy code.
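
The code word lengths $\lceil -\log(p_i) \rceil$ used in this construction can be
checked numerically. The sketch below is illustrative only: the spectrum `probs` is a
hypothetical example, logarithms are taken to base 2, and the function names are not
from the original paper.

```python
import math

def shannon_lengths(probs):
    """Code word length ceil(-log2 p) for each eigenvalue p of the density operator."""
    return [math.ceil(-math.log2(p)) for p in probs]

def kraft_sum(lengths):
    """Sum of 2^(-length); a prefix free code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -l for l in lengths)

probs = [0.5, 0.25, 0.125, 0.125]                     # hypothetical spectrum of rho
lengths = shannon_lengths(probs)                      # one length per eigenvector
entropy = -sum(p * math.log2(p) for p in probs)       # S(rho) in bits
expected = sum(p * l for p, l in zip(probs, lengths)) # expected code word length
```

For this dyadic spectrum the expected length equals the entropy; for a general spectrum
the rounding up costs less than one extra qubit per code word on average.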

From this encoding, we can see that the average lengths of condensable codes obey
Kraft's inequality (if they did not, then we could lossily compress a mixture to less
than its von Neumann entropy). Since the base length of a string is bounded below by its
average length, Kraft's inequality also holds for the base lengths.

Theorem 10 (Kraft's Inequality for Condensable Strings) If $\mathcal{E}$ is a set of
orthogonal condensable strings then

$$\sum_{|\psi\rangle \in \mathcal{E}} 2^{-L(|\psi\rangle)} \le \sum_{|\psi\rangle \in \mathcal{E}} 2^{-\bar{l}(|\psi\rangle)} \le 1 \qquad (10)$$
10.1 Coding Information With Classical Side Channels

Bostrom and Felbinger [3] gave a scheme for lossless quantum compression using
classical side channels. If $\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ is the mixture to be
compressed, then they assume that the value of $i$ is known to the compressor, Alice. If
she encodes $\mathcal{E}$ using a unitary operation $C$, then she sends the base length of
the compressed string to Bob, the decompressor, through a classical side channel. She
then sends $L(C(|\psi_i\rangle))$ qubits of the compressed string's zero-extended form to
Bob. Since the length of the encoded string is encoded classically, it is not necessary
to use a prefix free code to encode the quantum part; thus $C$ is unitary but not
necessarily condensable.

Rallan and Vedral gave another scheme for lossless quantum compression with
classical side channels which does not use zero-extended forms. They envisaged that
the compressed state would be represented by photons, thus using a ternary alphabet
$\{|0\rangle, |1\rangle, |e\rangle\}$ where $|e\rangle$ denotes the absence of a photon
and marks the end of the string. They assumed that Alice has $n$ copies of a mixture
$\mathcal{E}$ which she would like to send to Bob. In this scheme, Alice only sends Bob
the value of $n$. This scheme has a nice physical interpretation.

10.2 Physical Interpretations of Indeterminate Length Strings 

Bostrom and Felbinger [I] pointed out that variable length quantum strings can be
realised in a quantum system whose particle number is not conserved. Rallan and
Vedral described in detail an example system where the average length of a string
can be interpreted as its energy. A Hilbert space $H^{\otimes n}$ can be realised by a
sequence of photons $|\phi_1\rangle \otimes \ldots \otimes |\phi_n\rangle$ in which
$|\phi_i\rangle$ represents exactly one photon with frequency $\omega_i$. The value of
the qubit $|\phi_i\rangle$ is realised by the polarisation of its photon, either
horizontal $|0\rangle$ or vertical $|1\rangle$. The absence of a photon can be represented
by $|e\rangle$, which is orthogonal to $|0\rangle$ and $|1\rangle$. We obtain
indeterminate length strings by allowing the number of photons to exist in superposition.
The frequency of each photon $|\phi_i\rangle$ is chosen to be equal so that
$\omega_i = \omega$ for some value $\omega$. The energy in a superposition of photons is
the average energy required to either create or destroy that superposition
($\hbar\omega$ per photon of frequency $\omega$). Thus the energy of an indeterminate
length string of photons $|\phi\rangle$ is proportional to its average length and is given
by $\hbar\omega\,\bar{l}(|\phi\rangle)$. In this interpretation, lossy data compression
can be interpreted as the average energy required to destroy a mixture $\mathcal{E}$,
since destroying a mixture is equivalent to sending it to the environment (which is
another recipient).

10.3 Asynchronous Model of Quantum Computation 

If quantum computers are to be used to solve classical problems efficiently, then it
seems reasonable to demand that all the paths of a quantum Turing machine
halt simultaneously. However, this demand raises various issues. If two strings with
different halting times are input in superposition, then the resulting computation halts
at a superposition of different times. It is uncomputable to say whether an
arbitrarily constructed quantum Turing machine halts at a deterministic time [10]. A
quantum Turing machine that halts is not unitary since it cannot be reversed after the
computation has halted.

These issues were resolved by Linden and Popescu [12], who described a quantum
Turing machine augmented with an ancillary system in which computations take place
after the Turing machine has "halted". The ancillary system can record the time since
computation began so that the output is disentangled from the time at which it "halts".
Thus there is a well-defined model of quantum computation in which computation paths
halt at different times.
11 Communication Model for Lossless Quantum Compression

We have discussed the arguments why lossless quantum compression is impossible.
Now we describe how lossless quantum compression of unknown mixtures is
possible by taking an appropriate model of communication. For quantum compression,
there are two cases: the mixture to be compressed can be known or unknown. We show
that the same holds for classical compression, depending on whether the decision on what
data is to be compressed is made before or after the data has been read.

11.1 Lossless Quantum Compression of Known Mixtures 

Bostrom and Felbinger [3] gave a scheme for losslessly compressing a mixture
$\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ in which the value $i$ of the state to be
compressed is known to the compressor, Alice. Thus she can deduce the base length
$L(|\psi_i\rangle)$ of the string to be compressed, transmit this many qubits to Bob
through a quantum channel and send $L(|\psi_i\rangle)$ through a classical channel. Since
Bob knows the value of $L(|\psi_i\rangle)$, the compressed quantum states are not
necessarily prefix free (i.e. condensable).

However, the string $|\Psi\rangle = L(|\psi_i\rangle) \otimes |\psi_i\rangle$ which
represents the classical and quantum parts together is not prefix free. An example of a
prefix free encoding is

$$|\Psi'\rangle = 1^{\lceil \log(L(|\psi_i\rangle)) \rceil} \, 0 \, L(|\psi_i\rangle) \otimes |\psi_i\rangle \qquad (11)$$

where $1^{\lceil \log(L(|\psi_i\rangle)) \rceil} \, 0 \, L(|\psi_i\rangle)$ is sent through
the classical side channel. (To find the length of $|\Psi'\rangle$, we find the length of
the first contiguous sequence of 1's followed by a 0. Then we read the next
$\lceil \log(L(|\psi_i\rangle)) \rceil$ bits to find the length of $|\psi_i\rangle$.) Thus
with a slight modification to Bostrom and Felbinger's scheme, we have a prefix free
encoding of known quantum mixtures where the length of the encoded data can be read from
the classical part. However, an important question remains open: "What is the rate of
compression?". Since this scheme is based on one-one coding rather than prefix free
encoding, the analysis of the compression rate is trickier, though there are known
bounds between the two compression rates classically. Instead, we search for
a prefix free quantum encoding and analyse its compression rate. This will also enable
us to compress unknown mixtures. But first, we resolve the issues of compressing with
unknown indeterminate length strings.
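
The classical part of the prefix free encoding above can be sketched as a
self-delimiting header. This is a minimal illustration under one stated assumption:
the run of 1's counts the number of bits in the binary form of the length (its
`bit_length`), which differs slightly from $\lceil \log L \rceil$ when $L$ is a power
of two. The function names are hypothetical.

```python
def encode_header(length):
    """Unary count of the bits in `length`, a 0 separator, then `length` in binary."""
    if length < 1:
        raise ValueError("length must be positive")
    binary = format(length, "b")
    return "1" * len(binary) + "0" + binary

def decode_header(bits):
    """Return (length, number of header bits consumed) from the front of `bits`."""
    k = bits.index("0")                       # size of the leading run of 1s
    length = int(bits[k + 1 : 2 * k + 1], 2)  # the next k bits hold the length
    return length, 2 * k + 1

header = encode_header(5)                     # "111" + "0" + "101"
decoded = decode_header(header + "00110")     # trailing bits stand in for the payload
```

Because the run of 1's announces how many bits follow the separator, no header is a
prefix of another, so the receiver can locate the end of the classical part without
any further marker.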

11.2 How Much Memory Is Free? 

Koashi and Imoto [2] modelled data compression as taking place on a computer
where only $N$ qubits of memory are available. Let $C$ be a classical prefix code for
a random variable $X$. A naive guess is that, on average, we can losslessly compress
$N/H(X)$ copies of $X$ into a memory of $N$ bits. Let $l_{max}$ be the length of $C$'s
longest code word and let $n$ be the number of copies of $X$ we compress into the memory.
Then, in the worst case, $X^n$ compresses to $n l_{max}$ bits. If $n$ is chosen to be
greater than $N/l_{max}$, then with some small probability the compression fails. Thus we
can compress $X^n$ losslessly only when $n \le N/l_{max}$. (If we want to perform a
computation on the remaining bits and we decide the computation in advance of the
compression, then only $N - n l_{max}$ bits are available.)
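
The gap between the naive estimate and the worst-case guarantee can be made concrete
with a short calculation. The memory size, distribution and code word lengths below are
hypothetical values chosen for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy H(X) in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def naive_copies(N, probs):
    """Average-case guess: N / H(X) copies fit into N bits."""
    return N / entropy(probs)

def guaranteed_copies(N, lengths):
    """Worst case: every copy may need the longest code word."""
    return N // max(lengths)

N = 21                                   # hypothetical memory size in bits
probs = [0.5, 0.25, 0.125, 0.125]        # hypothetical distribution of X
lengths = [1, 2, 3, 3]                   # code word lengths of a prefix code for X
naive = naive_copies(N, probs)
safe = guaranteed_copies(N, lengths)
free_in_advance = N - safe * max(lengths)  # bits that can be promised beforehand
```

With these numbers the naive estimate is 12 copies but only 7 copies are guaranteed to
fit, and no free bits can be promised to another computation in advance.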

Both classical and quantum lossless compression can be modelled in two ways, 
depending on whether the states to be compressed are known or unknown. 

Ad Hoc "Known" Compression Classically, once the value of X is known, it can
be deduced how much memory is free. The compression takes place ad hoc in
that the decision whether there is enough free space to compress is made by
examining the memory at the time of compression. We find an analogous quantum
situation when the value of $\mathcal{E}$ is known, in which case we can use Bostrom and
Felbinger's scheme to record classically the amount of memory which has
been used so that the amount of free space available is known.

Reversible "Unknown" Compression The classical analogy of compressing a mix- 
ture when its value is unknown is describing the compression of a random variable 
X before its value is known. If Alice decides to compress X and then to perform 
a computation in the free memory, she has to assume the worst case compression 
rate in order for the combined compression-computation to be reversible. The 
same situation arises in the quantum case when the value of the mixture being 
compressed is unknown. 

In the unknown quantum case, only some branches of computation may fail through 
lack of memory, in which case the compressor cannot be sure whether the computation 
has succeeded at a later time. In the unknown classical reversible case, the compressor 
can measure at a later time with certainty whether a computation has failed through 
lack of memory. However, in advance of the compression, the compressor does not 
know whether the computation has failed. 

11.3 How Many Qubits to Transmit? 

If Alice has an unknown indeterminate length quantum string, how can she decide how 
many qubits to transmit to Bob? Consider the following example. 

Example 12 (Open and Closed Channels) Alice and Bob have mobile phones which 
they leave switched on all the time. Alice says to Bob that she will phone him at 7pm 
if she is available to have dinner. If Alice does not phone Bob at 7pm, Bob can deduce 
that Alice is not available to have dinner. Whenever the phones are switched on, the 
channel is open and information is being exchanged. 

Thus if there is a channel between Alice and Bob, then Alice and Bob are always
communicating [16]. We can represent an "open-closed" channel with a ternary alphabet
{0, 1, e} where e denotes "no communication". If Alice sends Bob a string in such a
channel, then there is no need to use a prefix code since the closure of the channel
marks the end of the sequence.

A model of an always open channel is shown in Fig. 1. We do not require
a "no communication" character e since the channel is always open. If Alice wants to
send Bob a sequence of condensable indeterminate length strings, she condenses them
and sends the qubits one by one. By assuming that Bob's memory is also padded with
zeroes, we avoid the entanglement issues described by Schumacher and Westmoreland.
As the transmission is taking place, Alice is free to append additional unknown
condensable quantum strings onto those being transmitted. However, neither Alice nor
Bob can measure whether a string has been transmitted.
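
The swap-based protocol of the always open channel can be mimicked classically. The
sketch below is an assumption-laden toy: it uses ordinary bits in place of qubits and a
hypothetical function name, and it only illustrates that running the protocol for more
steps than the string's length does no harm, so no decision about "how many qubits to
transmit" is needed.

```python
def transmit(alice, steps):
    """Swap-based transfer through a single transmission cell (classical sketch)."""
    alice = list(alice)            # Alice's memory: string followed by zero-padding
    bob = [0] * len(alice)         # Bob's memory starts as all zeroes
    cell = 0                       # the transmission cell starts as zero
    for i in range(steps):
        alice[i], cell = cell, alice[i]   # Alice swaps her ith bit with the cell
        cell, bob[i] = bob[i], cell       # Bob swaps the cell with his ith bit
    return alice, cell, bob

message = [1, 0, 1, 0, 0, 0]       # the string 101 condensed and zero-padded
after_three = transmit(message, 3)
after_six = transmit(message, 6)
```

After three steps the string has already arrived; the three further steps only
exchange zeroes, leaving both memories unchanged.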

12.1 When Can Bob Decompress? 

Suppose Bob wants to decompress the strings as they arrive. How can he decide when
to begin decompression? In the standard model of quantum computation [17, 18],
computations begin and end at determinate times. However, as Linden and Popescu
showed [12], there is a well-defined model of quantum computation where computations
can begin and end at superpositions of different times. As when compressing onto a
finite memory, we have two cases, depending on whether the mixture to be compressed is
known or unknown, which correspond to the two classical cases of whether the protocol is
decided in advance or ad hoc. If the mixture is unknown and Bob wants to perform a
measurement on the decompressed state, then he waits for the maximum possible time of
transmission and decompression before making the measurement. Similarly, if a random
variable X is classically compressed and the time of a measurement is decided before the
value of X is known, then the measurement may fail.

Figure 1: A channel between Alice and Bob, who are connected through a single
transmission cell. At each time step, Alice can read from or write to the transmission
cell, then Bob can read from or write to the transmission cell. Using this channel, Alice
can reversibly send Bob a string. She condenses the string in the initial part of her
memory and pads it with zeroes. The transmission cell and Bob's memory are initially
prepared as zeroes. To send a message to Bob, at step $i$, Alice swaps the $i$th (qu)bit
in her memory with the value in the transmission cell, then Bob swaps the value in the
transmission cell with the $i$th bit in his memory. A string of base length $l$ is
transmitted in $l$ steps.

12.2 Lossless Quantum Compression of Unknown States 

We have described two situations for lossless quantum compression of a mixture
$\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ of non-orthogonal states. When the mixture is
known (i.e. the value of $i$ is known to the compressor), Bostrom and Felbinger's scheme
[I] can be used and the lengths of the encoded data are an observable. When the mixture
is unknown, the mixture can be compressed and transmitted using a condensable code as
shown in Fig. 1. The expected average length $E(\bar{l}(\mathcal{E}))$ of the optimal
code is known to be approximately the von Neumann entropy of $\mathcal{E}$, but in order
to keep a mixture of indeterminate length strings intact, the number of qubits of each
string that need to be kept intact is its base length. However, in either the known or
unknown case, the optimal rate of compression in terms of the base lengths is still
open. We will show, by assigning probabilities to Hilbert spaces according to the
probability that a string lies in a space, that the optimal rate of compression can be
found by finding the most probable Hilbert spaces first.

13 Prefix Free Strings 

Lossless quantum compression makes use of prefix free strings. Schumacher and
Westmoreland defined prefix free quantum strings in terms of their zero-extended forms
using the trace operator. It is simpler just to directly generalise the classical
definition.

Definition 14 (Prefix Free Quantum Strings) A string $|\phi\rangle$ is the prefix of a
string $|\psi\rangle$ if there exists a string $|\chi\rangle$ with
$|\langle e|\chi\rangle| = 0$ such that

$$|\langle \phi\chi | \psi \rangle| > 0 \qquad (12)$$

A set $\mathcal{E}$ of quantum strings is prefix free if any two (not necessarily
distinct) strings in $\mathcal{E}$ are prefix free.

Unlike deterministic classical strings, deterministic quantum strings can be prefixed 
by themselves. For example, 

l*> = ^ d3) 

is a prefix of itself. In classical information theory, the empty string e is not prefix
free since multiple copies of e are not uniquely decipherable. Superpositions of the
empty string $|e\rangle$ are self-prefix since if
$|\psi\rangle = \alpha|e\rangle + \beta|\phi\rangle$ then
$|\langle \psi\phi | \psi \rangle| = |\alpha\beta\langle\phi|\phi\rangle|$.

As did Bostrom and Felbinger, we can define Hilbert spaces with prefix free
bases.

Definition 15 (Prefix Free Hilbert Space) A Hilbert space H is prefix free if it 
has a basis of prefix free strings. 

We check such Hilbert spaces are well-defined by showing that any orthogonal basis of 
a prefix free Hilbert space is prefix free. 

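Theorem 16 (Prefix Free Hilbert Spaces are Well-defined) If $H$ is a prefix free Hilbert
space which is the span of a sequence of prefix free strings
$|\xi_1\rangle, \ldots, |\xi_n\rangle$, then any orthogonal basis for $H$ is prefix free.

Proof To show this holds, we show that any string in $H$ is not a prefix of itself
and that any two orthogonal strings $|\phi\rangle$ and $|\psi\rangle$ in $H$ are prefix
free. Let $|\psi\rangle$ be any string in $H$. Then $|\psi\rangle$ can be expressed as

$$|\psi\rangle = \sum_i \alpha_i |\xi_i\rangle \qquad (14)$$

Let $|\chi\rangle$ be any quantum string with $|\langle e|\chi\rangle| = 0$. Then

$$|\langle \psi | \psi\chi \rangle| = \left| \sum_{i,j} \alpha_j^* \alpha_i \langle \xi_j | \xi_i \chi \rangle \right| \qquad (15)$$

Since the $|\xi_i\rangle$'s form a prefix free set, $|\langle \xi_j | \xi_i \chi \rangle| = 0$
for all $i$ and $j$, hence

$$|\langle \psi | \psi\chi \rangle| = 0 \qquad (16)$$

and $|\psi\rangle$ is not a prefix of itself.

Now we show that any two orthogonal strings $|\phi\rangle$ and $|\psi\rangle$ in $H$ are
prefix free. Again, let $|\chi\rangle$ be a string with $|\langle e|\chi\rangle| = 0$. We
can express $|\psi\rangle$ and $|\phi\rangle$ as

$$|\psi\rangle = \sum_i \alpha_i |\xi_i\rangle \qquad (17)$$

$$|\phi\rangle = \sum_j \beta_j |\xi_j\rangle \qquad (18)$$

Then using the prefix free property of the $|\xi_i\rangle$'s,

$$|\langle \phi\chi | \psi \rangle| = \left| \sum_{i,j} \beta_j^* \alpha_i \langle \xi_j \chi | \xi_i \rangle \right| \qquad (19)$$

$$= 0 \qquad (20)$$

which completes the proof.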
Prefix free Hilbert spaces can be placed side by side so that their elements are
condensable.

Theorem 17 A set of strings in a prefix free Hilbert space is condensable.

Proof Let $\{|\xi_i\rangle\}_i$ be a prefix free basis for a prefix free Hilbert space
$H$. Let $l_{max}$ be the longest base length of a string in $H$ (i.e. the size of the
register). For each $|\psi\rangle \in H$, let $|\psi^{zef}\rangle$ be its zero-extended
form (so that $|\psi\rangle$ is padded out with zeros to form a string of determinate
length $l_{max}$). Given an integer $n$, we can design a unitary operation $U_n$ on the
basis vectors so that

$$U_n\left(|\xi_{i_1}^{zef}\rangle \otimes \cdots \otimes |\xi_{i_n}^{zef}\rangle\right) = \left(|\xi_{i_1}\rangle \otimes \cdots \otimes |\xi_{i_n}\rangle\right)^{zef} \qquad (21)$$

$U_n$ is reversible and hence unitary. $U_n$ condenses any set of $n$ strings drawn from $H$.
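
The action of the condensing operation on basis vectors can be illustrated with a
classical prefix free code. This is a sketch under stated assumptions: the code
{0, 10, 110, 111} is taken from Example 6, bit strings stand in for basis vectors, and
the helper names are hypothetical. It only checks that the map is injective on
zero-padded code words, which is what allows it to be extended to a permutation, and
hence a unitary, on the whole register.

```python
from itertools import product

CODE = ["0", "10", "110", "111"]          # classical prefix free code
L_MAX = max(len(c) for c in CODE)         # register size per string

def pad(word, width):
    """Zero-extended form: append zeros up to the given width."""
    return word + "0" * (width - len(word))

def parse(bits):
    """Read the unique code word at the front of `bits`."""
    for c in CODE:
        if bits.startswith(c):
            return c
    raise ValueError("no code word found")

def condense(padded_words):
    """Map (w1 0.., w2 0.., ...) to w1 w2 ... 0..0 of the same total width."""
    words = [parse(p) for p in padded_words]
    return pad("".join(words), L_MAX * len(padded_words))

def uncondense(bits, n):
    """Inverse of condense: peel off n code words and re-pad each one."""
    words = []
    for _ in range(n):
        w = parse(bits)
        words.append(pad(w, L_MAX))
        bits = bits[len(w):]
    return words

inputs = [list(t) for t in product([pad(c, L_MAX) for c in CODE], repeat=3)]
outputs = [condense(x) for x in inputs]
```

The prefix free property is what makes `parse` unambiguous, and therefore what makes
the condensed sequence decodable without knowing the individual lengths in advance.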

Since any set of orthogonal prefix free strings is condensable, their average and
base lengths obey Kraft's inequality. If Alice wants to send Bob a sequence of
strings from prefix free Hilbert spaces, she can condense them and send them as shown
in Fig. 1.

18 Lossless Quantum Data Compression 

The aim of lossless quantum data compression is, using as few qubits as possible, to 
encode a mixture of non-orthogonal states. When the states are orthogonal, the mixture 
can be encoded using determinate length strings and the rate of compression is simply 
the von Neumann entropy of the mixture. When the states are non-orthogonal, the 
expected length of the encoding is the expected base length of the compressed strings, 
since this is the minimum number of qubits that must be left intact for the mixture to 
be retrievable exactly and without error. 

Definition 19 (Lossless Quantum Code) Let $\mathcal{E} = \{(p_i, |\phi_i\rangle)\}_i$ be a
mixture of quantum states in a Hilbert space $H$. A lossless code $C$ is a unitary
operation from $H$ to a prefix free Hilbert space $H'$. If $B$ is an orthogonal basis for
$H$ then $C(B)$ is a set of code words for $C$.

The expected length of compression of $C$ is:

$$E(L(C(\mathcal{E}))) = \sum_i p_i \, L(C(|\phi_i\rangle)) \qquad (22)$$

$C$ is optimal if for any other code $C'$,

$$E(L(C(\mathcal{E}))) \le E(L(C'(\mathcal{E}))) \qquad (23)$$
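
The expected length of compression can be evaluated directly for a small example. In
the sketch below, states are written as dictionaries from basis labels to amplitudes
and a code assigns a classical code word to each basis label; the mixture, both codes
and the function names are hypothetical choices made for illustration.

```python
import math

def encoded_base_length(state, code):
    """Base length of C|state>: the longest code word among its non-zero components."""
    return max(len(code[b]) for b, a in state.items() if abs(a) > 0)

def expected_base_length(mixture, code):
    """Sum over the mixture of probability times encoded base length."""
    return sum(p * encoded_base_length(s, code) for p, s in mixture)

r = 1 / math.sqrt(2)
mixture = [
    (0.5, {"0": 1.0}),
    (0.25, {"1": 1.0}),
    (0.25, {"0": r, "2": r}),      # not orthogonal to |0>
]
code_a = {"0": "0", "1": "10", "2": "11"}   # hypothetical prefix free code words
code_b = {"0": "10", "1": "0", "2": "11"}   # same lengths, assigned differently
cost_a = expected_base_length(mixture, code_a)
cost_b = expected_base_length(mixture, code_b)
```

The superposed state always pays for its longest branch, so assigning the short code
word to the most probable basis vector (as in `code_a`) gives the smaller expected
base length here.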

An example of lossless quantum compression is shown in Fig. 2. In analysing lossless
codes, it is convenient to define probability in terms of subspaces. The idea is to encode
small subspaces with high probability using short codes. We define the probability
$P(X)$ of a subspace $X$ to be the total probability of all strings lying completely
within $X$. We can share the probability of $X$ equally between its basis vectors and
define the average probability $\bar{P}(X)$ of $X$ to be $P(X)/\dim(X)$. If a subspace $Y$
has a large average probability, then it can be encoded with short strings. The average
probability of another subspace $X$ might be very small, but if there is a reasonably
large probability that strings lie in the space $X \oplus Y$, then we can encode $X$ with
reasonably short strings so that the strings that lie both in $X$ and $Y$ are encoded
with a reasonably small base length. We define the average probability $\bar{P}(X : Y)$
of a subspace $X$ with respect to a subspace $Y$ to be the probability that a string lies
partially in $X$ and completely within $X \oplus Y$, divided by the dimension of $X$.
Figure 2: An example of lossless quantum compression. With high probability, say $p$, a
state lies in the $|0\rangle$-$|1\rangle$ plane. We say that the probability of the plane
is $p$. Since the plane is spanned by two vectors, $|0\rangle$ and $|1\rangle$, we say
that the average probability of the plane is $p/2$ and encode the plane as strings of
length $-\log(p/2)$ by encoding $|0\rangle$ and $|1\rangle$ as strings each of length
$-\log(p/2)$. There are two non-orthogonal strings $|2\rangle$ and $|\phi\rangle$ which
are outside the $|0\rangle$-$|1\rangle$ plane; suppose the probabilities of these two
strings sum to $q = p(|2\rangle) + p(|\phi\rangle) < p/2$. Then we can encode $|2\rangle$
as a string of length $-\log(q)$, in which case the string $|2\rangle$ is encoded as a
string of determinate length. The other string $|\phi\rangle$ is encoded as a string in a
superposition of lengths $-\log(p/2)$ and $-\log(q)$, so that its base length is
$-\log(q)$. Since we encode in this way, we say that the probability of $|2\rangle$ with
respect to the $|0\rangle$-$|1\rangle$ plane is $q$.



Definition 20 (Subspace Probabilities) Let $\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ be a mixture of quantum
states in the Hilbert space $H$. If $X$ is a subspace of $H$, then the probability $P$ and
average probability $\bar{P}$ of $X$ are

$$P(X) = \sum_{|\psi_i\rangle \in X} p_i \tag{24}$$

$$\bar{P}(X) = \frac{P(X)}{\dim(X)} \tag{25}$$

The probability $P(X : Y)$ of a subspace $X$ with respect to a subspace $Y$ is the sum of
the probabilities of the strings in the space $X \oplus Y$ which are partially within $X$. The
average probability $\bar{P}(X : Y)$ of a subspace $X$ with respect to a subspace $Y$ is this sum
divided by the dimension of $X$.

$$P(X : Y) = \sum_{|\psi_i\rangle \in X \oplus Y \text{ and } |\psi_i\rangle \notin Y} p_i \tag{26}$$

$$\bar{P}(X : Y) = \frac{P(X : Y)}{\dim(X)} \tag{27}$$
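
The subspace probabilities can be sketched in code under a strong simplifying assumption that is ours, not the paper's: every subspace is spanned by computational basis vectors, so a subspace is a set of basis indices and a state is described only by its support (the indices where its amplitude is non-zero). The function names and the example mixture, which is modelled loosely on Figure 2, are hypothetical.

```python
from fractions import Fraction

# Assumption: subspaces are coordinate subspaces (frozensets of basis indices)
# and each state is represented by its support only.

def prob(mixture, X):
    """P(X): total probability of the states lying completely within X."""
    return sum((p for p, support in mixture if support <= X), Fraction(0))

def prob_wrt(mixture, X, Y):
    """P(X : Y): probability of the states inside X (+) Y that are not inside Y."""
    return sum((p for p, support in mixture
                if support <= (X | Y) and not support <= Y), Fraction(0))

def avg_prob(mixture, X):
    """Average probability: P(X) / dim(X)."""
    return prob(mixture, X) / len(X)

def avg_prob_wrt(mixture, X, Y):
    """Average probability of X with respect to Y: P(X : Y) / dim(X)."""
    return prob_wrt(mixture, X, Y) / len(X)

# Hypothetical mixture: one state in the |0>-|1> plane, the state |2>,
# and a state |phi> that overlaps both |0> and |2>.
mixture = [
    (Fraction(8, 10), frozenset({0, 1})),
    (Fraction(1, 10), frozenset({2})),
    (Fraction(1, 10), frozenset({0, 2})),
]
plane = frozenset({0, 1})
outside = frozenset({2})

print(avg_prob(mixture, plane))               # average probability of the plane
print(prob(mixture, outside))                 # P of span{|2>} on its own
print(avg_prob_wrt(mixture, outside, plane))  # with respect to the plane
```

In this example the state $|\phi\rangle$ contributes to span{$|2\rangle$} only once the plane is taken into account, which is the effect the definition of $P(X : Y)$ is designed to capture.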

A space $H$ might contain some subspaces which have higher average probabilities
than others. We can decompose $H$ into its subspaces by first finding the largest subspace $X_1$
which has the highest average probability, then finding the largest subspace $X_2$ which
has the highest average probability with respect to $X_1$, then finding the largest subspace
$X_3$ which has the highest average probability with respect to $X_1$ and $X_2$, and so on. Let $P_i$
be the projection onto the subspace $X_i$. We can define a density operator by summing
these projections, where the eigenvalue for $P_i$ is given by the average probability of
$X_i$ with respect to $X_{1 \ldots i-1} = X_1 \oplus \cdots \oplus X_{i-1}$.

Definition 21 (Decompositions of a Hilbert space by mixture) Let $\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$
be a mixture of quantum states in the Hilbert space $H$. We define the subspace
decomposition of $H$ as $X_1, X_2, \ldots, X_m$ where $X_i$ is defined as:

$$X_1 = X : \bar{P}(X) > \bar{P}(X') \quad \forall X' \subseteq H \text{ and } X' \neq X \tag{28}$$

$$X_{i+1} = X : \bar{P}(X : X_{1 \ldots i}) > \bar{P}(X' : X_{1 \ldots i}) \quad \forall X' \subseteq H - (X_{1 \ldots i}) \text{ and } X' \neq X \tag{29}$$

where $X_{1 \ldots i} = X_1 \oplus \cdots \oplus X_i$. Let $P_i$ be the projection onto $X_i$. Then the density
operator decomposition of $H$ is the density operator $\rho$ defined as:

$$\rho = \sum_i \bar{P}(X_i : X_{1 \ldots i-1}) P_i \tag{30}$$
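
The decomposition can be sketched as a greedy search. This is only an illustration under assumptions of ours: candidate subspaces are restricted to those spanned by computational basis vectors, states are represented by their supports, ties are broken in favour of the larger subspace, and the search is brute force (exponential in the dimension). The names `prob_wrt` and `decompose` and the example mixture are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def prob_wrt(mixture, X, Y):
    """P(X : Y) for coordinate subspaces; with Y empty this reduces to P(X)."""
    return sum((p for p, s in mixture if s <= (X | Y) and not s <= Y),
               Fraction(0))

def decompose(mixture, dim):
    """Repeatedly pick the subspace with the highest average probability with
    respect to everything chosen so far, preferring larger subspaces on ties."""
    chosen = frozenset()   # X_1 (+) ... (+) X_i so far
    parts = []             # list of (X_i, average probability of X_i)
    while len(chosen) < dim:
        free = [k for k in range(dim) if k not in chosen]
        best = None
        for size in range(1, len(free) + 1):
            for combo in combinations(free, size):
                X = frozenset(combo)
                avg = prob_wrt(mixture, X, chosen) / size
                if best is None or (avg, size) > best[:2]:
                    best = (avg, size, X)
        if best[0] == 0:
            break          # the remaining directions carry no probability
        parts.append((best[2], best[0]))
        chosen |= best[2]
    return parts

# Hypothetical mixture modelled on Figure 2 (supports of the states only).
mixture = [
    (Fraction(8, 10), frozenset({0, 1})),  # a state in the |0>-|1> plane
    (Fraction(1, 10), frozenset({2})),     # |2>
    (Fraction(1, 10), frozenset({0, 2})),  # |phi>
]
parts = decompose(mixture, 3)
for X, avg in parts:
    print(sorted(X), avg)
# Trace of the resulting density operator: sum of avg * dim over the parts.
print(sum(avg * len(X) for X, avg in parts))
```

For this mixture the plane is selected first and span{$|2\rangle$} second, and the eigenvalues weighted by dimension sum to one.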

The probability of each string $|\psi_i\rangle$ is only counted once, so the density operator
decomposition has trace 1:

$$\sum_i \bar{P}(X_i : X_1 \oplus \cdots \oplus X_{i-1}) \dim(X_i) = 1 \tag{31}$$

Since the subspaces are orthogonal to one another, we can use the converse of Kraft's
inequality to encode each basis vector of each $X_i$ as a prefix string of length
$\lceil -\log(\bar{P}(X_i : X_1 \oplus \cdots \oplus X_{i-1})) \rceil$. We now show that this is the optimal encoding.
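
The use of the converse of Kraft's inequality can be illustrated classically. The sketch below assumes a decomposition given as a list of (average probability, dimension) pairs, rounds $-\log_2$ of each average probability up to an integer, checks the Kraft sum, and builds binary codewords with the standard canonical construction. The example values and function names are assumptions for illustration, and the codewords label basis vectors only; the sketch says nothing about how superpositions are handled.

```python
import math

def code_lengths(parts):
    """Integer length ceil(-log2(avg)) for each (average probability, dimension)."""
    return [math.ceil(-math.log2(avg)) for avg, _ in parts]

def kraft_sum(parts, lengths):
    """Sum over subspaces of dim * 2^(-length); a prefix code needs this <= 1."""
    return sum(dim * 2.0 ** -l for (_, dim), l in zip(parts, lengths))

def prefix_codewords(lengths):
    """Canonical prefix code for a list of codeword lengths with Kraft sum <= 1."""
    words = []
    value, prev = 0, 0
    for l in sorted(lengths):
        value <<= (l - prev)               # extend the counter to the new length
        words.append(format(value, f"0{l}b"))
        value += 1
        prev = l
    return words

# Assumed decomposition: a 2-dimensional subspace with average probability 0.4
# and a 1-dimensional subspace with average probability 0.2.
parts = [(0.4, 2), (0.2, 1)]
lengths = code_lengths(parts)
per_vector = [l for (_, dim), l in zip(parts, lengths) for _ in range(dim)]
words = prefix_codewords(per_vector)

print(lengths, kraft_sum(parts, lengths), words)
```

Because each length is at least $-\log_2$ of the corresponding average probability, the Kraft sum is bounded by the trace of the density operator decomposition, which is 1.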



Theorem 22 (Noiseless Coding Theorem for Lossless Quantum Codes) Let
$\mathcal{E} = \{(p_i, |\psi_i\rangle)\}$ be a mixture of quantum states in the Hilbert space $H$. Let $X_1, \ldots, X_m$ be
the decomposition of $H$ with density operator

$$\rho = \sum_i \bar{P}(X_i : X_{1 \ldots i-1}) P_i \tag{32}$$

Let $Z_l = \bigoplus_{i \,:\, 2^{-l} \le \bar{P}(X_i : X_{1 \ldots i-1}) < 2^{-l+1}} X_i$ ($Z_l$ is the space of strings that are encoded as
strings of length $l$). Then there is a prefix free code $C$ such that for all $|\psi\rangle \in Z_l$,

$$L(C(|\psi\rangle)) \le l \tag{33}$$

where the expected length of $C$ is bounded by:

$$S(\rho) \le E(L(C(\mathcal{E}))) < S(\rho) + 1 \tag{34}$$

and for any other prefix code $C'$:

$$E(L(C(\mathcal{E}))) \le E(L(C'(\mathcal{E}))) + 1 \tag{35}$$
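
The entropy bound on the expected length can be checked numerically for a small case. The sketch assumes a decomposition given as (average probability, dimension) pairs, so that the density operator has each average probability as an eigenvalue with multiplicity equal to the dimension, and it assumes every basis vector of a subspace is encoded with length $\lceil -\log_2 \bar{P} \rceil$. The example values and function names are hypothetical.

```python
import math

def entropy(parts):
    """S(rho) in bits, where rho has eigenvalue avg with multiplicity dim."""
    return -sum(dim * avg * math.log2(avg) for avg, dim in parts)

def expected_length(parts):
    """Expected length when each subspace is encoded with length ceil(-log2(avg))."""
    return sum(dim * avg * math.ceil(-math.log2(avg)) for avg, dim in parts)

# Assumed decomposition: eigenvalue 0.4 (twice) and eigenvalue 0.2 (once).
parts = [(0.4, 2), (0.2, 1)]
S = entropy(parts)
E = expected_length(parts)

print(f"S(rho) = {S:.4f}, expected length = {E:.4f}")
print("bound holds:", S <= E < S + 1)
```

This is the same calculation as in Shannon's classical noiseless coding theorem, applied to the eigenvalues of the density operator decomposition.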

Proof Let $C'$ be any prefix free lossless quantum code on $H$. The proof proceeds as
follows.

• First we show by induction that if $C'$ is a prefix code then $H$ can be divided up
into orthogonal subspaces $Z'_l$ which are encoded with base length $l$.

• Next we show that if $C'$ is optimal, then the average probability of $Z'_l$ with respect
to $Z'_{1 \ldots l-1}$ is about $2^{-l}$.

• We show by induction that if $C'$ is optimal then $Z'_l \subseteq Z_{1 \ldots l}$ for all $l$, which shows
that for any $|\psi\rangle$, $L(C'(|\psi\rangle)) \ge L(C(|\psi\rangle))$.

Let $Z'_1$ be the set of strings $|\psi\rangle \in H$ such that $L(C'(|\psi\rangle)) = 1$. Then $Z'_1$ forms
a subspace, as any string of base length 1 has determinate length 1. Let $Z'_{l+1}$ be the
set of strings $|\psi\rangle \in H - (Z'_{1 \ldots l})$ such that $L(C'(|\psi\rangle)) = l + 1$. Then assuming that
$Z'_1, \ldots, Z'_l$ form subspaces, so does $Z'_{l+1}$, since if $|\psi_1\rangle$ and $|\psi_2\rangle$ are two strings in $Z'_{l+1}$
with $L(C'(\alpha|\psi_1\rangle + \beta|\psi_2\rangle)) < l + 1$ then $\alpha|\psi_1\rangle + \beta|\psi_2\rangle \in Z'_k$ where $k < l + 1$. Since, by
our assumption, $Z'_k$ is a subspace, if $\alpha|\psi_1\rangle + \beta|\psi_2\rangle \in Z'_k$ then $|\psi_1\rangle$ and $|\psi_2\rangle$ are not in
$H - (Z'_1 \oplus \cdots \oplus Z'_l)$ and hence not in $Z'_{l+1}$.

Now we use Shannon's noiseless coding theorem for lossless codes to show that
$\bar{P}(Z'_l : Z'_{1 \ldots l-1}) \approx 2^{-l}$ if $C'$ is optimal. The expected rate of encoding of $C'$ is:

$$E(L(C'(\mathcal{E}))) = \sum_l \bar{P}(Z'_l : Z'_{1 \ldots l-1}) \dim(Z'_l)\, l \tag{36}$$

Since $C'$ is prefix free, we have $\sum_l \dim(Z'_l) 2^{-l} \le 1$. Thus according to Shannon's
noiseless coding theorem for lossless codes, $E(L(C'(\mathcal{E})))$ is minimal to within one qubit
if $C'$ is chosen so that $Z'_l$ is encoded using strings of length $\lceil -\log(\bar{P}(Z'_l : Z'_{1 \ldots l-1})) \rceil$,
in other words if for each $l$,

$$\lfloor -\log(\bar{P}(Z'_l : Z'_{1 \ldots l-1})) \rfloor \le l \le \lceil -\log(\bar{P}(Z'_l : Z'_{1 \ldots l-1})) \rceil \tag{37}$$

Thus the encoding of $C'$ is much like the encoding of $C$ except that the subspaces could
be chosen differently.

We now assume that for each $l$:

$$l = \lfloor -\log(\bar{P}(Z'_l : Z'_{1 \ldots l-1})) \rfloor \tag{38}$$

(this only changes the expected length of compression of $C'$ by one qubit) and show by
induction that this implies that $Z'_l \subseteq Z_{1 \ldots l}$. $Z'_1 \subseteq Z_1$ since $Z_1$ is chosen to contain all
the subspaces of average probability at least 1/2. Assuming that $Z'_k \subseteq Z_{1 \ldots k}$ for all
$k \le l$, for any subspace $X$, $\bar{P}(X : Z'_{1 \ldots l}) \le \bar{P}(X : Z_{1 \ldots l})$. If $Z'_{l+1}$ was not in $Z_{1 \ldots l+1}$
then $H - Z_{1 \ldots l+1}$ would contain a subspace $X$ such that $\bar{P}(X : Z'_{1 \ldots l}) \ge 1/2^{l+1}$, but this is a
contradiction.

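Since for each $l$, $Z'_l \subseteq Z_{1 \ldots l}$, if $L(C'(|\psi\rangle)) = l$ then $L(C(|\psi\rangle)) \le l$. To prove this we
have assumed that

$$l = \lfloor -\log(\bar{P}(Z'_l : Z'_{1 \ldots l-1})) \rfloor \tag{39}$$

However, if $C'$ is optimal, then by the bounds on $l$ derived above the expected length of $C'$
could be one qubit less. Thus $C$ is optimal to within 1 qubit.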

23 Using Variable Length Compression in Noisy 
Channels 

Bostroem and Felbinger [3] suggested there may be a relationship between error
correction and variable length coding. The optimal code compresses highly probable
subspaces so that the expected number of dimensions in which a string lies is minimised.
Suppose an encoded mixture is sent through a noisy channel which introduces errors
in each qubit independently. Then by minimising the expected number of dimensions in
which the mixture lies, the probability of its disturbance is minimised. An alternative
is to use a variable length code in the diagonal basis of the mixture's density operator, in
which case the probability of decoherence might be higher, but the expected probability
of being able to distinguish the initial and final states is smaller. We provide a
simple example to illustrate the differences between the two schemes.

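Example 24 (Lossless Quantum Compression to Prevent Noise) Let $\mathcal{E}$ be a
mixture of quantum states with probabilities:

$$P(\sqrt{1-\delta}\,|0\rangle + \sqrt{\delta}\,|1\rangle) = 1/2 \tag{40}$$

$$P(\sqrt{1-\delta}\,|0\rangle - \sqrt{\delta}\,|1\rangle) = 1/2 \tag{41}$$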






Then the density operator $\rho$ which represents the mixture $\mathcal{E}$ is:

$$\rho = (1-\delta)|0\rangle\langle 0| + \delta|1\rangle\langle 1| \tag{42}$$
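
The diagonal form of the density operator can be confirmed numerically. The sketch below builds $\sum_i p_i |\psi_i\rangle\langle\psi_i|$ for the two states of the example with real amplitudes; the value $\delta = 0.1$ and the function name `density_operator` are assumptions for illustration.

```python
import math

def density_operator(mixture):
    """rho = sum_i p_i |psi_i><psi_i| for states with real amplitudes."""
    n = len(mixture[0][1])
    rho = [[0.0] * n for _ in range(n)]
    for p, psi in mixture:
        for r in range(n):
            for c in range(n):
                rho[r][c] += p * psi[r] * psi[c]
    return rho

delta = 0.1  # assumed value, for illustration only
a, b = math.sqrt(1 - delta), math.sqrt(delta)
mixture = [(0.5, (a, b)), (0.5, (a, -b))]

rho = density_operator(mixture)
for row in rho:
    print(row)
```

The off-diagonal contributions of the two states have opposite signs and cancel, which leaves the diagonal entries $1-\delta$ and $\delta$.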

We now consider compressing $\mathcal{E}$ using the base length compression scheme described in
the previous section, and compressing $\mathcal{E}$ in the diagonal basis of its density operator so
that its average length is minimised. We assume that each qubit in the encoded state
is disturbed with probability $p$.

Base Length Compression If $\mathcal{E}$ is encoded to minimise its expected base length, then
the two strings are encoded as strings of determinate length 1. The probability that
the encoded mixture is disturbed is $p$.

Average Length Compression If $\mathcal{E}$ is encoded in the diagonal basis, then $|0\rangle$ is
encoded as a string of average length $-\log(1-\delta)$ and $|1\rangle$ is encoded as a string
of average length $-\log(\delta)$ (assuming, e.g., that there are a large number of copies
so that the strings can be encoded with non-integer lengths on average). When $\delta$
is small, each state in the mixture is a superposition of a very short string with
high amplitude and a very long string with small amplitude. The probability that
the whole state is disturbed is very small ($p^{-\log(1-\delta)}$), but with large probability
($p^{-\log(\delta)}$) the state suffers a very small disturbance.
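
To compare the two schemes in terms of length, the sketch below computes the expected average length of the diagonal-basis encoding. It assumes that the average length of a superposition is the sum of the component lengths weighted by the squared amplitudes, so that both states of the mixture have average length $-(1-\delta)\log_2(1-\delta) - \delta\log_2(\delta)$. The function name and the sample values of $\delta$ are ours.

```python
import math

def average_length(delta):
    """Average length when |0> has length -log2(1-delta) and |1> has -log2(delta),
    weighted by the squared amplitudes (1-delta) and delta."""
    return -(1 - delta) * math.log2(1 - delta) - delta * math.log2(delta)

BASE_LENGTH = 1  # base length compression: both states have determinate length 1

for delta in (0.5, 0.1, 0.01):
    print(f"delta={delta}: average length {average_length(delta):.4f}, "
          f"base length {BASE_LENGTH}")
```

Under this assumption the average length equals the entropy of the density operator and falls well below one qubit as $\delta$ becomes small, whereas the base length stays at 1.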

Base length compression minimises the probability that a state is disturbed at all,
whereas average length compression minimises the probability that the decompressed state
can be distinguished from the original. In either case, error correction can be used
to amplify the probability that the mixture is not disturbed. When applied to the
base length compression scheme, it amplifies the probability that the mixture is left
completely intact.

25 Conclusions 

We have given a model of communication for lossless quantum compression and shown 
how to find the optimal code and rate. We now describe avenues for future work. 

25.1 Converse of Kraft's Inequality for Average Lengths 

The converse of Kraft's inequality for non-integer average lengths of indeterminate
length strings is still open. If $1/2 < p < 1$ then there is no string of average length $-\log(p)$,
since superpositions of the empty string $|\epsilon\rangle$ are not prefix free. It remains open to
find, for example, whether there are three orthogonal strings of average length $\log_2(3)$.

25.2 Bounds on Known Lossless Quantum Compression 

We did not show how to bound the rate of Bostrom and Felbinger's lossless compression 
scheme by the rate of prefix free lossless quantum compression. It is likely that the 
relationship can be found by looking at the relationships between one-one coding and 
prefix coding for classical codes [1411131 [Tfi] . 

25.3 Applications of Lossless Quantum Compression 

Many open problems in quantum information theory [17] [18], such as entanglement
catalysis \F§\, are phrased as "Can this state be transformed into that state exactly 
and without error subject to these conditions?". Maybe lossless quantum compression 
could be applied to solve some of these problems. Bostrom and Felbinger [3] pointed out that
there might also be interesting applications of variable length compression in quantum
cryptography to securely transfer data. We conjecture that base length compression
minimises the probability that Eve learns any information, whereas average
length compression minimises the average amount of information that Eve learns.



25.4 Lossy Compression of a Mixture of Mixtures 

A well-known open problem in quantum information is to find the (lossy)
compression rate of a mixture of mixtures (this problem is related to the Holevo bound).
It might be simpler to find the compression rate in terms of the average length of a
variable length code [3] rather than in terms of Schumacher coding.